Method and apparatus for identifying network attacks

ABSTRACT

Embodiments of the present disclosure provide a method and apparatus for identifying network attacks. The method can include: acquiring access data within at least two time periods of a target website server, wherein the access data include one or more fields; determining, for each of the at least two time periods, a quantity of access data having same content in at least two of the one or more fields; determining whether the quantities of access data for each of the at least two time periods are the same; and in response to the quantities of access data being the same, determining that at least two access requests of the access data are network attacks.

CROSS REFERENCE TO RELATED APPLICATION

The disclosure claims the benefits of priority to International application number PCT/CN2016/105286, filed Nov. 10, 2016 and Chinese application number 201510802440.3, filed Nov. 19, 2015, both of which are incorporated herein by reference in their entireties.

BACKGROUND

With the constant development of Internet technologies, attacks against website servers are increasingly common and adversely affect those servers.

At present, there are two types of attacks against website servers. The first type is large-traffic network attacks, characterized by a short attack time but large traffic (for example, a network being attacked 10,000 times in 1 second), such as Denial of Service (DoS) and Distributed Denial of Service (DDoS). Large-traffic network attacks consume resources of a target website server by sending a large quantity of access requests to the target website server within a short time period. As a result, the target server can be overloaded and the network can be blocked, thus affecting normal access to a website.

The second type is small-traffic network attacks, characterized by a long attack duration but small traffic (for example, a network being attacked continuously for one day but only 10 times per second). Small-traffic network attacks scan resources of a target website server by sending a small quantity of access requests to the target website server over a long time. Although only a small part of the resources is scanned within a short time period, all the resources in the target website server can be obtained by scanning over a long period of time. This causes security problems for the target website server. For example, an attacker may find a vulnerability in the server from the resources.

Conventionally, a method as shown in FIG. 1 can be adopted for large-traffic attacks.

In step S100, network access data packets of a target website server are acquired, and source Internet Protocol (IP) addresses of the network access data packets are parsed out from the network access data packets.

In step S110, a first quantity of identical source IP addresses is counted within a predetermined time period.

In step S120, network accesses including the source IP addresses are determined as network attacks if the first quantity reaches a predetermined threshold.

With the method, network access data packets are parsed to obtain IP addresses, a first quantity of identical IP addresses accessing a target website server is counted within a predetermined time period, and network accesses including the IP addresses are considered as network attacks if the first quantity reaches a predetermined threshold.
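The conventional counting of steps S100-S120 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `flag_large_traffic`, the input list of already-parsed source IP addresses, and the sample data are hypothetical and are not taken from any particular implementation.

```python
from collections import Counter

def flag_large_traffic(source_ips, threshold):
    """Return the source IPs whose request count in one period reaches the threshold.

    source_ips: source IP addresses already parsed from the access packets
                of a single predetermined time period (step S100).
    threshold:  the predetermined threshold of step S120 (hypothetical value).
    """
    counts = Counter(source_ips)  # step S110: count identical source IPs
    # step S120: "reaches" is read here as greater than or equal to
    return {ip for ip, n in counts.items() if n >= threshold}

# Illustrative data: one address sends 12,000 requests, another only 10.
period = ["1.1.1.4"] * 12000 + ["1.1.1.1"] * 10
flagged = flag_large_traffic(period, threshold=10000)
```

In this sketch only the high-volume address is flagged; an address that sends a small, steady number of requests never reaches the threshold.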

However, for small-traffic network attacks, the quantity of access requests within a preset time period is not large and is similar to the quantity of many normal access requests. The first quantity obtained for small-traffic network attacks with the foregoing method is therefore always less than the predetermined threshold, so the access requests cannot be determined as network attacks through the foregoing method. Conversely, if the predetermined threshold is set too small, normal access requests are easily misidentified as network attacks.

Therefore, small-traffic network attacks cannot be identified conventionally.

SUMMARY OF THE DISCLOSURE

Embodiments of the present application provide a method for identifying network attacks. The method can include: acquiring access data for at least two time periods of a target website server, wherein the access data includes one or more fields; determining, for each of the at least two time periods, a quantity of access data having same content in at least two of the one or more fields; and determining at least two access requests of the access data are network attacks based on the quantities of access data for each of the at least two time periods.

Embodiments of the application further provide an apparatus for identifying network attacks. The apparatus can include: an acquisition unit configured to acquire access data for at least two time periods of a target website server, wherein the access data includes one or more fields; a first determination unit configured to determine, for each of the at least two time periods, a quantity of access data having same content in at least two of the one or more fields; and a second determination unit configured to determine at least two access requests of the access data are network attacks based on the quantities of access data for each of the at least two time periods.

Embodiments of the application further provide a non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of an electronic device to cause the device to perform a method for identifying network attacks. The method can include: acquiring access data for at least two time periods of a target website server, wherein the access data includes one or more fields; determining, for each of the at least two time periods, a quantity of access data having same content in at least two of the one or more fields; and determining at least two access requests of the access data are network attacks based on the quantities of access data for each of the at least two time periods.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the technical solutions according to the embodiments of the present application or in the prior art more clearly, the accompanying drawings for describing the embodiments or the prior art are introduced briefly in the following. It is apparent that the accompanying drawings in the following description are only some embodiments of the present application. Those of ordinary skill in the art can obtain other drawings according to the accompanying drawings without any creative efforts.

FIG. 1 is a flowchart of a method for identifying network attacks.

FIG. 2 is a flowchart of a method for identifying network attacks according to embodiments of the present application.

FIG. 3 is a flowchart of a method for identifying network attacks according to embodiments of the present application.

FIG. 4 is a flowchart of a method for identifying network attacks according to embodiments of the present application.

FIG. 5 is a flowchart of a method for identifying network attacks according to embodiments of the present application.

FIG. 6 is a schematic diagram of an apparatus for identifying network attacks according to embodiments of the present application.

DETAILED DESCRIPTION

In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and fully described below with reference to the accompanying drawings in the embodiments of the present application. It is obvious that the embodiments to be described are only a part rather than all the embodiments of the present application. All other embodiments derived by those of ordinary skill in the art based on the embodiments in the present application without creative efforts should be encompassed in the protection scope of the present application.

FIG. 2 is a flowchart of a method for identifying network attacks according to embodiments of the present application. The method can include steps S200-S230.

In step S200, access data of a target website server within at least two time periods can be acquired according to preset fields.

In some embodiments, the preset fields can include at least one of an IP address, a host, a user agent, and a URL. The IP address can indicate a source IP address that accesses a target website. The host can indicate a domain name that accesses the target website. The user agent can indicate a browser that accesses the target website, such as Google Chrome, QQ Browser, IE Browser and the like. The user agent can also indicate a search engine that accesses the target website, such as a web crawler and the like. The URL (i.e., Uniform Resource Locator) can indicate an address that accesses the target website.

The time length of the time period can be preset. For example, the time length of the time period is 1 minute. That is, access data of a target website server within at least two 1-minute time periods is acquired according to preset fields.

In some embodiments, the time periods can include adjacent time periods.

The system can intercept an access request for accessing the target website server, parse the access request, for example, through a Java class library, and acquire content of the IP-address field in a network layer, and content of the host field, content of the user-agent field, and content of the URL field in an application layer.

In step S210, the quantity of access data having the same content in each of the preset fields within each of the time periods can be counted.

The following example 1 is access data of a target website server within 3 time periods (1 minute each) acquired according to preset fields (“Host”, “IP address”, “User Agent”).

First time period (2013 Sep. 18 17:20:00 to 2013 Sep. 18 17:21:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.2     chrome
www.aaa.com  1.1.1.3     chrome
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.2     chrome
www.aaa.com  1.1.1.1     Mozilla/5.0

Second time period (2013 Sep. 18 17:21:00 to 2013 Sep. 18 17:22:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0

Third time period (2013 Sep. 18 17:22:00 to 2013 Sep. 18 17:23:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.4     chrome
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0

EXAMPLE 1

For the content shown in the above example 1, after S210, the quantities of access data having the same content in the preset fields (the host, the IP address, the user agent) within each of the time periods are counted, as shown in the following example 2.

Within the first time period:

Host         IP address  User Agent   Number of accesses
www.aaa.com  1.1.1.1     Mozilla/5.0  2
www.aaa.com  1.1.1.1     Mozilla/4.0  2
www.aaa.com  1.1.1.2     chrome       2
www.aaa.com  1.1.1.3     chrome       1

Within the second time period:

Host         IP address  User Agent   Number of accesses
www.aaa.com  1.1.1.1     Mozilla/5.0  2
www.aaa.com  1.1.1.1     Mozilla/4.0  2

Within the third time period:

Host         IP address  User Agent   Number of accesses
www.aaa.com  1.1.1.1     Mozilla/5.0  2
www.aaa.com  1.1.1.1     Mozilla/4.0  2
www.aaa.com  1.1.1.4     chrome       1

EXAMPLE 2
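The counting of step S210 can be sketched as follows, using the second and third time periods of example 1 as input. This is a minimal sketch under stated assumptions: representing each piece of access data as a `(host, ip, user_agent)` tuple and the helper name `count_per_period` are choices made for the illustration only.

```python
from collections import Counter

# Second and third time periods of example 1, one (host, ip, user_agent)
# tuple per access request. The tuple layout is an assumption of this sketch.
second_period = [
    ("www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
]
third_period = [
    ("www.aaa.com", "1.1.1.4", "chrome"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    ("www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
]

def count_per_period(periods):
    """Return one Counter per time period, keyed by the tuple of preset fields."""
    return [Counter(records) for records in periods]

counts = count_per_period([second_period, third_period])
```

Each `Counter` maps a combination of field contents to its number of accesses in that period, which corresponds to one table of example 2.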

For another example, the following example 3 includes access data of a target website server within 3 time periods (1 minute each) acquired according to a preset field (e.g., the IP address).

First time period (2013 Sep. 18 17:20:00 to 2013 Sep. 18 17:21:00):

IP address

1.1.1.1

1.1.1.2

1.1.1.3

1.1.1.1

1.1.1.1

1.1.1.2

1.1.1.1

Second time period (2013 Sep. 18 17:21:00 to 2013 Sep. 18 17:22:00):

IP address

1.1.1.1

1.1.1.1

1.1.1.1

1.1.1.1

Third time period (2013 Sep. 18 17:22:00 to 2013 Sep. 18 17:23:00):

IP address

1.1.1.4

1.1.1.1

1.1.1.1

EXAMPLE 3

For the content in the above example 3, after S210, the quantity of access data having the same content in the preset field “IP address” within each of the time periods is counted, as shown in the following example 4.

Within the first time period:

IP address  Number of accesses
1.1.1.1     4
1.1.1.2     2
1.1.1.3     1

Within the second time period:

IP address  Number of accesses
1.1.1.1     4

Within the third time period:

IP address  Number of accesses
1.1.1.1     4
1.1.1.4     1

EXAMPLE 4

In step S220, it is determined whether the quantities of access data having the same preset fields within each of the time periods are the same. If the quantities of access data having the same preset fields within each of the time periods are the same, step S230 can be performed.

As shown in the above example 4, the quantities of access data including the IP address of “1.1.1.1” are the same within the first time period to the third time period, and the system performs step S230.

As shown in the above example 2, the quantities of access data having a host of “www.aaa.com”, an IP address of “1.1.1.1”, and a user agent of “Mozilla/5.0” are the same within the first time period to the third time period. And the quantities of access data having a host of “www.aaa.com”, an IP address of “1.1.1.1”, and a user agent of “Mozilla/4.0” are the same within the first time period to the third time period, and the system performs step S230.

In S230, in response to the quantities of access data having the same preset fields within each of the time periods being the same, access requests corresponding to access data having the same quantity can be determined as small-traffic network attacks.

In some embodiments, according to the features of small-traffic network attacks, it can be determined whether the quantities of access data having the same preset fields within at least two time periods are the same. If the quantities of access data having the same preset fields within at least two time periods are the same, the features of small-traffic network attacks are matched, and it can be determined that access requests corresponding to access data having the same quantity are small-traffic network attacks.
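The comparison of steps S220 and S230 can be sketched as follows, using the per-period counts of example 4. This is a minimal sketch, not a definitive implementation: the function name `find_constant_counts` is hypothetical, and requiring a key to be present in every time period is an assumption of the sketch.

```python
def find_constant_counts(counts_per_period):
    """Return {key: count} for keys whose count is identical in every period.

    counts_per_period: list of dicts, one per time period, mapping the
    content of the preset field(s) to the number of accesses (step S210).
    A key missing from any period is not reported (assumption of this sketch).
    """
    if len(counts_per_period) < 2:
        return {}  # at least two time periods are needed for a comparison
    first, rest = counts_per_period[0], counts_per_period[1:]
    return {
        key: n
        for key, n in first.items()
        if all(period.get(key) == n for period in rest)  # step S220
    }

# Per-period counts of example 4, keyed by IP address.
example_4 = [
    {"1.1.1.1": 4, "1.1.1.2": 2, "1.1.1.3": 1},
    {"1.1.1.1": 4},
    {"1.1.1.1": 4, "1.1.1.4": 1},
]
suspects = find_constant_counts(example_4)  # candidates for step S230
```

With the counts of example 4, only the IP address “1.1.1.1” has the same quantity in all three time periods, so only its access requests would be passed on to step S230.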

In some embodiments of the present application, as shown in FIG. 3, step S200 may further include S201 and S202.

In step S201, an access log of the target website server can be collected.

Websites on the Internet typically have a multi-layer architecture. For example, a website architecture can include nginx and tomcat. As nginx and tomcat both have their own access logs, nginx and tomcat can record the same access request in their own access logs. To avoid repeatedly collecting access logs, only the access log of the most front-end application is collected when an access log of the target website server is collected. Here, nginx is a more front-end application than tomcat, and thus only the access log of nginx needs to be collected.

In step S202, access data within at least two time periods can be acquired from the access log according to the preset fields.

The preset fields can include at least one of the IP address, the host, the user agent, and the URL. The IP address can indicate a source IP address that accesses a target website. The host can indicate a domain name that accesses the target website. The user agent can indicate a browser that accesses the target website, such as Google Chrome, QQ Browser, IE Browser and the like. The user agent can also indicate a search engine that accesses the target website, such as a web crawler and the like. The URL (i.e., Uniform Resource Locator) can indicate an address that accesses the target website.

In some embodiments, after collecting the access log of the target website server, the system can acquire access data through a preset instruction. For example, the system can acquire record times in the access log through $time_local, acquire the recorded host through $host, acquire the IP address through $remote_addr, acquire the user agent through $http_user_agent, and acquire the URL through $request_uri.

For example, access data (e.g., access data since 2013 Sep. 18 17:58:00) can be recorded. The access data acquired according to the preset fields (e.g., the IP address, the host, the user agent) is shown in the following:

Record time            Host         IP address  User Agent
2013 Sep. 18 17:20:00  www.aaa.com  1.1.1.1     Mozilla/5.0
2013 Sep. 18 17:20:02  www.aaa.com  1.1.1.2     chrome
2013 Sep. 18 17:20:08  www.aaa.com  1.1.1.3     chrome
2013 Sep. 18 17:20:15  www.aaa.com  1.1.1.1     Mozilla/4.0
2013 Sep. 18 17:20:30  www.aaa.com  1.1.1.1     Mozilla/5.0
2013 Sep. 18 17:20:33  www.aaa.com  1.1.1.2     chrome
2013 Sep. 18 17:20:45  www.aaa.com  1.1.1.1     chrome
2013 Sep. 18 17:21:00  www.aaa.com  1.1.1.1     Mozilla/4.0
2013 Sep. 18 17:21:15  www.aaa.com  1.1.1.1     Mozilla/5.0
2013 Sep. 18 17:21:30  www.aaa.com  1.1.1.1     Mozilla/4.0
2013 Sep. 18 17:21:45  www.aaa.com  1.1.1.1     Mozilla/5.0
2013 Sep. 18 17:22:00  www.aaa.com  1.1.1.4     chrome
2013 Sep. 18 17:22:00  www.aaa.com  1.1.1.1     Mozilla/4.0
2013 Sep. 18 17:22:15  www.aaa.com  1.1.1.1     Mozilla/5.0
2013 Sep. 18 17:22:30  www.aaa.com  1.1.1.1     Mozilla/4.0
2013 Sep. 18 17:22:45  www.aaa.com  1.1.1.1     Mozilla/5.0

In some embodiments, the time periods include adjacent time periods. The time length of the time period can be preset. For example, the time length of a time period can be preset as one minute, and access data within three time periods as shown below can be acquired.

First time period (2013 Sep. 18 17:20:00 to 2013 Sep. 18 17:21:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.2     chrome
www.aaa.com  1.1.1.3     chrome
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.2     chrome
www.aaa.com  1.1.1.1     Mozilla/5.0

Second time period (2013 Sep. 18 17:21:00 to 2013 Sep. 18 17:22:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0

Third time period (2013 Sep. 18 17:22:00 to 2013 Sep. 18 17:23:00):

Host         IP address  User Agent
www.aaa.com  1.1.1.4     chrome
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0
www.aaa.com  1.1.1.1     Mozilla/4.0
www.aaa.com  1.1.1.1     Mozilla/5.0

Therefore, a larger amount of more accurate access data can be acquired according to the collected access log. The obtained access data allows the system to determine, according to the features of the small-traffic network attacks, whether the numbers of accesses of a same piece of access data within a preset number of time periods are the same. If the numbers of accesses of a same piece of access data are the same, the features of the small-traffic network attacks are matched, and it can be determined that access requests including the same piece of access data are small-traffic network attacks.

A large amount of access data is generally recorded in an access log of a website server. After the system collects the access log, the access data can occupy lots of system memory if not stored in a database. As a result, the running efficiency of the system can be lowered. To solve the foregoing problem, the method may further include, after step S202, storing the access data within the at least two time periods in a database. Correspondingly, the step of counting the quantity of access data having the same content in each of the preset fields within each of the time periods can include querying the database for access data within at least two time periods and counting the quantity of access data having the same content in each of the preset fields within each of the time periods.

In some embodiments, the database may be part of the system or an external database associated with the system. The database may also be associated with another database, to increase the storage space of the database.

The step of querying the database for access data within at least two time periods may further include querying, via the system, for the access data within the at least two time periods through a query instruction (e.g., a SQL statement).
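Storing the access data in a database and counting it with a SQL statement can be sketched as follows. This is only an illustration under stated assumptions: the in-memory SQLite database, the table name `access_data`, and its column names are hypothetical, and the description does not name a particular database product or schema.

```python
import sqlite3

# In-memory SQLite database with a hypothetical schema for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_data (period INTEGER, host TEXT, ip TEXT, user_agent TEXT)"
)

# Second and third time periods of the access data shown above.
rows = [
    (2, "www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    (2, "www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
    (2, "www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    (2, "www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
    (3, "www.aaa.com", "1.1.1.4", "chrome"),
    (3, "www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    (3, "www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
    (3, "www.aaa.com", "1.1.1.1", "Mozilla/4.0"),
    (3, "www.aaa.com", "1.1.1.1", "Mozilla/5.0"),
]
conn.executemany("INSERT INTO access_data VALUES (?, ?, ?, ?)", rows)

# Count access data with identical content in all preset fields, per period.
query = """
    SELECT period, host, ip, user_agent, COUNT(*) AS accesses
    FROM access_data
    GROUP BY period, host, ip, user_agent
    ORDER BY period, host, ip, user_agent
"""
result = conn.execute(query).fetchall()
conn.close()
```

Letting the database perform the grouping keeps the raw access data out of application memory, which is the motivation the description gives for storing it in a database.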

In some embodiments, in addition to identifying small-traffic network attacks, the system also needs to deal with large-traffic network attacks. As discussed above, step S230 can include determining whether access requests corresponding to access data having the same quantity are small-traffic network attacks. To identify large-traffic network attacks, after step S230, the method may further include: if the quantities of access data having the same preset fields within each of the time periods are different, further determining whether the quantities of access data having the same preset fields within each of the time periods are the same and reach a preset threshold.

If the quantities of access data having the same preset fields within each of the time periods are the same and reach the preset threshold, the method can further include determining that access requests corresponding to the access data are large-traffic network attacks.

In some embodiments, the preset threshold may be an empirical value. The preset threshold may be set to match the feature of a large number of accesses during a large-traffic network attack. For example, the preset threshold is set to 10,000. For the content shown in Example 4, the preset threshold can be 10,000 and the number of accesses of the IP address “1.1.1.4” within the third time period can be 12,000. Then, it can be determined whether the number of accesses reaches the preset threshold of 10,000. As only the number of accesses of the IP address “1.1.1.4” within the third time period reaches the preset threshold, step S250 can be performed. That is, access requests including the IP address of “1.1.1.4” are determined as large-traffic network attacks.

In some embodiments, although small-traffic network attacks show a fixed attack frequency, the numbers of accesses of some small-traffic network attacks within different time periods may have a slight deviation, as shown in the following Example 5.

Within the first time period: the number of accesses of the IP address “1.1.1.1” is 8

Within the second time period: the number of accesses of the IP address “1.1.1.1” is 7

Within the third time period: the number of accesses of the IP address “1.1.1.1” is 8

Within the fourth time period: the number of accesses of the IP address “1.1.1.1” is 8

Within the fifth time period: the number of accesses of the IP address “1.1.1.1” is 9

Within the sixth time period: the number of accesses of the IP address “1.1.1.1” is 8

EXAMPLE 5

In the above example, the numbers of accesses of the IP address “1.1.1.1” are not exactly the same within the first time period to the sixth time period. Such a situation, where the numbers of accesses have a slight deviation, may still belong to small-traffic network attacks. However, with the method described above, the access requests from the IP address “1.1.1.1” cannot be identified as small-traffic network attacks, because the quantities of access data having the same preset fields within each of the time periods are not the same.

Embodiments of the disclosure further provide a method for identifying network attacks based on the above method. FIG. 4 is a flowchart of a method for identifying network attacks, according to embodiments of the present application. The method can include the following steps S300-S330.

In step S300, access data of a target website server within at least two time periods can be acquired according to preset fields. This step can be the same as or similar to step S200, the description of which is omitted herein.

In step S310, the quantity of access data having the same content in each of the preset fields within each of the time periods can be counted. This step can be the same as or similar to step S210, the description of which is omitted herein.

In step S320, a maximum value and a minimum value of the quantities can be determined. For the content shown in Example 5, a maximum value of the quantities of access data having the same content in the preset field (e.g., the IP address) within the six time periods can be determined as 9 and a minimum value as 7.

In step S321, a difference between the maximum value and the minimum value can be determined. For example, a difference between the maximum value of 9 and the minimum value of 7 is 2.

In step S322, it can be determined whether the difference is less than or equal to a preset value. If the difference is less than or equal to the preset value, step S330 can be performed.

In some embodiments, the preset value may be empirical. For example, the preset value can be two. Therefore, if the determined difference is greater than the preset value (e.g., two), it can be determined that access requests having the IP address of “1.1.1.1” are not small-traffic network attacks. If the determined difference is less than or equal to the preset value, step S330 can be performed.

In step S330, it can be determined that access requests having the determined difference being less than or equal to the preset value are small-traffic network attacks.
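The range check of steps S320-S330 can be sketched as follows, using the quantities of Example 5. This is a minimal sketch: the function name `is_small_traffic_by_range` is hypothetical, and the comparison uses “less than or equal to”, following the wording of step S330.

```python
def is_small_traffic_by_range(quantities, preset_value):
    """Apply steps S320-S322 to the per-period quantities of one key.

    quantities:   number of accesses in each time period (at least two).
    preset_value: empirical tolerance for the max-min difference.
    """
    maximum, minimum = max(quantities), min(quantities)  # step S320
    difference = maximum - minimum                       # step S321
    return difference <= preset_value                    # step S322

# Quantities for the IP address "1.1.1.1" in Example 5.
example_5 = [8, 7, 8, 8, 9, 8]
matches = is_small_traffic_by_range(example_5, preset_value=2)
```

For Example 5 the difference is 9 - 7 = 2, so a preset value of two lets the access requests reach step S330, while a preset value of one would not.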

In some embodiments of the present application, step S300 may further include collecting an access log of the target website server. This step can be the same as or similar to step S201, the description of which is omitted herein.

Step S300 may further include acquiring access data within at least two time periods from the access log according to the preset fields. This step can be the same as or similar to step S202, the description of which is omitted herein.

A larger amount of more accurate access data can be acquired according to the collected access log. The obtained access data allows the system to determine, according to the features of the small-traffic network attacks, whether the numbers of accesses of the same piece of access data within a preset number of time periods are the same. If the numbers of accesses of the same piece of access data within a preset number of time periods are the same, the features of the small-traffic network attacks are matched, and it can be determined that access requests including the same piece of access data are small-traffic network attacks.

A large amount of access data is generally recorded in an access log of a website server. After the system collects the access log, the access data will occupy lots of system memory if not stored in a database. As a result, the running efficiency of the system can be lowered. In some embodiments, after the step of acquiring access data within at least two time periods from the access log according to the preset fields, the method may further include: storing the access data within the at least two time periods in a database. Correspondingly, the step of counting the quantity of access data having the same content in each of the preset fields within each of the time periods includes: querying the database for access data within at least two time periods and counting the quantity of access data having the same content in each of the preset fields within each of the time periods.

In some embodiments, in addition to identifying small-traffic network attacks, the system may also deal with large-traffic network attacks. To identify large-traffic network attacks, the method may further include, after step S330, if the determined difference is not less than the preset value, determining whether the quantities of access data having the same preset fields within each of the time periods are the same and reach a preset threshold.

If the quantities of access data having the same preset fields within each of the time periods are the same and reach the preset threshold, access requests corresponding to the access data can be determined as large-traffic network attacks.

In some embodiments, the preset threshold may be empirical. The preset threshold may be set to match the feature of a large number of accesses during a large-traffic network attack. For example, the preset threshold is set to 10,000. For the content shown in Example 4, the preset threshold can be 10,000 and the number of accesses of the IP address “1.1.1.4” within the third time period is 12,000. Then, it can be determined whether the number of accesses reaches the preset threshold of 10,000. As only the number of accesses of the IP address “1.1.1.4” within the third time period reaches the preset threshold, access requests including the IP address “1.1.1.4” can be determined as large-traffic network attacks.

FIG. 5 is a flowchart of another method for identifying network attacks that addresses the problem shown in Example 5, according to some embodiments of the present application. The method can include S400-S430.

In step S400, access data within at least two time periods of a target website server can be acquired according to preset fields.

This step can be the same as or similar to step S200, the description of which is omitted herein.

In step S410, the quantity of access data having the same content in each of the preset fields within each of the time periods can be counted.

This step can be the same as or similar to step S210, the description of which is omitted herein.

In step S420, an average value of the quantities can be determined.

For the content shown in Example 5, it is determined that an average value of the quantities (e.g., 8, 7, 8, 8, 9, 8) is 8.

In step S421, a difference between each of the quantities and the average value can be determined.

A difference between each of the quantities and the average value can be determined as below.

A difference within the first time period is 0

A difference within the second time period is 1

A difference within the third time period is 0

A difference within the fourth time period is 0

A difference within the fifth time period is 1

A difference within the sixth time period is 0

In step S422, it is determined whether each difference is less than a preset value. If each difference is less than the preset value, step S430 can be performed.

In the above example, if the preset value is 0.5, not every difference is less than the preset value. Thus it can be determined that access requests including the IP address “1.1.1.1” are not small-traffic network attacks. In configurations where the preset value is greater than 1, each difference would be less than the preset value, and step S430 would be performed.

In step S430, if each difference is less than the preset value, access requests corresponding to the access data can be determined as small-traffic network attacks.
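The average-based check of steps S420-S430 can be sketched as follows, again using the quantities of Example 5. This is a minimal sketch: the function name `is_small_traffic_by_average` is hypothetical, and the difference is taken as an absolute value, which matches the differences listed above.

```python
def is_small_traffic_by_average(quantities, preset_value):
    """Apply steps S420-S422 to the per-period quantities of one key.

    quantities:   number of accesses in each time period (at least two).
    preset_value: empirical bound each deviation must stay strictly below.
    """
    average = sum(quantities) / len(quantities)            # step S420
    differences = [abs(q - average) for q in quantities]   # step S421
    return all(d < preset_value for d in differences)      # step S422

# Quantities for the IP address "1.1.1.1" in Example 5.
example_5 = [8, 7, 8, 8, 9, 8]
strict = is_small_traffic_by_average(example_5, preset_value=0.5)
lenient = is_small_traffic_by_average(example_5, preset_value=1.5)
```

With a preset value of 0.5 the second and fifth time periods deviate too much and the check fails; with a preset value greater than 1 every difference is below the bound and step S430 would be performed.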

In some embodiments of the present application, step S400 may furtherinclude: collecting an access log of the target website server. Thisstep can be same or similar as step S201, description of which isomitted herein.

Step S400 may further include acquiring access data within at least twotime periods from the access log according to the preset fields. Thisstep can be the same or similar to step S202, description of which isomitted herein.

A larger amount of accurate access data can be acquired according to thecollected access log. The access data obtained can facilitate the systemto determine, according to the features of the small-traffic networkattacks, whether the numbers of accesses of the same piece of accessdata within a preset number of time periods are the same. If the numbersof accesses of the same piece of access data within a preset number oftime periods are the same, the features of the small-traffic networkattacks are matched, and it can be determined that access requestsincluding the same piece of access data are small-traffic networkattacks.

A large amount of access data is generally recorded in an access log ofa website server. After the system collects the access log, the accessdata can occupy lots of system memory if not stored in a database. As aresult, the running efficiency of the system can be lowered. To solvethe foregoing problem, in embodiments of the application, after the stepof acquiring access data within at least two time periods from theaccess log according to the preset fields, the method may furtherinclude: storing the access data within the at least two time periods ina database. Correspondingly, the step of counting the quantity of accessdata having the same content in each of the preset fields within each ofthe time periods includes: querying the database for access data withinat least two time periods and counting the quantity of access datahaving the same content in each of the preset fields within each of thetime periods.

In some embodiments, in addition to identifying small-traffic network attacks, the system may also deal with large-traffic network attacks. To identify large-traffic network attacks, the method may further include, after step S430, if each difference is not less than the preset value, determining whether the quantities of access data having the same preset fields within each of the time periods are the same and reach a preset threshold.

If the quantities of access data having the same preset fields within each of the time periods are the same and reach the preset threshold, the method may further include determining access requests corresponding to the access data as large-traffic network attacks.

In some embodiments, the preset threshold may be a manually set empirical value. The preset threshold may be set to match the feature of a large number of accesses during a large-traffic network attack. For example, the preset threshold can be set to 10,000. For the content shown in Example 4, the number of accesses of the IP address “1.1.1.4” within the third time period is 12,000. It can then be determined whether the number of accesses reaches the preset threshold of 10,000. As only the number of accesses of the IP address “1.1.1.4” within the third time period reaches the preset threshold, access requests including the IP address “1.1.1.4” can be determined as large-traffic network attacks.
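The threshold comparison in this example can be sketched as follows. The sketch follows the worked example, in which a single time period reaching the threshold is sufficient; the function names are hypothetical, and the counts for the first and second periods are invented placeholders, since only the 12,000 accesses in the third period and the threshold of 10,000 are stated in the text.

```python
def periods_reaching_threshold(counts_by_period, preset_threshold=10_000):
    """Return the time periods whose access count reaches the threshold."""
    return sorted(
        period
        for period, count in counts_by_period.items()
        if count >= preset_threshold
    )


def is_large_traffic_attack(counts_by_period, preset_threshold=10_000):
    """Flag the access data if any period reaches the preset threshold."""
    return bool(periods_reaching_threshold(counts_by_period, preset_threshold))


# Counts for the IP address "1.1.1.4", keyed by time period.
# Periods 1 and 2 are hypothetical; period 3 (12,000) is from the example.
counts_for_ip = {1: 40, 2: 55, 3: 12_000}
```

With these values, only the third period is reported by `periods_reaching_threshold(counts_for_ip)`, so the requests from that IP address would be treated as a large-traffic network attack.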

Embodiments of the present application further provide an apparatus implementing the steps of the method. The apparatus can be implemented by software, by hardware, or by a combination of software and hardware. Taking the software implementation as an example, an apparatus in a logical sense is formed by a Central Processing Unit (CPU) of a server reading corresponding computer program instructions into a memory and running them.

FIG. 6 illustrates a schematic diagram of a website anti-attack apparatus according to embodiments of the present application. The apparatus can include units 500-520.

An acquisition unit 500 can be configured to acquire access data within at least two time periods of a target website server, wherein the access data include one or more fields.

A first determination unit 510 can be configured to determine, for each of the at least two time periods, a quantity of access data having the same content in at least some of the one or more fields.

A second determination unit 520 can be configured to determine whether the quantities of access data having the same preset fields within each of the time periods are the same, and in response to the quantities of access data being the same, determine that at least some access requests of the access data are small-traffic network attacks.

Acquisition unit 500 may further include: a first acquisition subunit configured to collect an access log of the target website server; and a second acquisition subunit configured to acquire access data within at least two time periods from the access log according to the preset fields.

The first acquisition subunit can be further configured to collect an access log of a most front-end application of the target website server.

The apparatus further includes: a storage unit configured to store the access data within the at least two time periods in a database. Correspondingly, first determination unit 510 can be further configured to query the database for access data within at least two time periods, and count the quantity of access data having the same content in each of the preset fields within each of the time periods.

The time periods include adjacent time periods.

The preset fields include at least one of an IP address, a host, a user agent, and a URL.

Embodiments of the disclosure further provide an apparatus for identifying network attacks. The apparatus can include: an acquisition unit configured to acquire access data for at least two time periods of a target website server, wherein the access data includes one or more fields; a first determination unit configured to determine, for each of the at least two time periods, a quantity of access data having the same content in at least some of the one or more fields; a second acquisition unit configured to acquire a maximum value and a minimum value of the quantities; a third determination unit configured to determine a difference between the maximum value and the minimum value; and a fourth determination unit configured to determine access requests corresponding to the access data as small-traffic network attacks based on the determined difference.
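The processing performed by the second acquisition unit and the third and fourth determination units can be sketched as a single function. The sketch assumes, consistent with the claims, that the access requests are flagged when the difference between the maximum and minimum quantities is less than a preset threshold; the function and parameter names are hypothetical.

```python
def max_min_difference_check(counts, preset_threshold):
    """Return True if the spread of per-period counts is below the threshold.

    counts: quantity of matching access data in each time period.
    preset_threshold: hypothetical tolerance for the max-min difference.
    """
    if len(counts) < 2:
        # At least two time periods are needed for a comparison.
        return False
    difference = max(counts) - min(counts)
    return difference < preset_threshold
```

Compared with requiring strictly identical counts, this variant tolerates small fluctuations between periods.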

The acquisition unit can further include a first acquisition subunit configured to collect an access log of the target website server; and a second acquisition subunit configured to acquire the access data within at least two time periods from the access log.

The first acquisition subunit can be further configured to collect an access log of a front-end application of the target website server.

The apparatus can further include a storage unit configured to store the access data within the at least two time periods in a database, and the first determination unit can be further configured to query the database for access data within at least two time periods, and count the quantity of access data having the same content in each of the preset fields within each of the time periods.

Embodiments of the disclosure can further provide an apparatus for identifying network attacks. The apparatus can include an acquisition unit configured to acquire access data for at least two time periods of a target website server, wherein the access data includes one or more fields; a first determination unit configured to determine, for each of the at least two time periods, a quantity of access data having the same content in at least some of the one or more fields; an average determination unit configured to determine an average value of the quantities of access data having the same content in at least some of the one or more fields; and a fifth determination unit configured to determine a difference between the quantity of access data in each of the at least two time periods and the average value; determine whether the difference is less than a preset threshold; and determine, based on the difference, that access requests corresponding to the access data are small-traffic network attacks.

The acquisition unit can further include: a first acquisition subunit configured to collect an access log of the target website server; and a second acquisition subunit configured to acquire the access data within at least two time periods from the access log.

The first acquisition subunit is further configured to collect an access log of a front-end application of the target website server.

The apparatus can further include a storage unit configured to store the access data within the at least two time periods in a database, and the first determination unit can be further configured to query the database for access data within at least two time periods, and count the quantity of access data having the same content in each of the preset fields within each of the time periods.

The time periods comprise adjacent time periods. The fields can include at least one of an Internet Protocol (IP) address, a domain name that accesses the target website, a browser that accesses the target website, and a Uniform Resource Locator (URL).

In the 1990s, an improvement on a technology could be clearly distinguished as an improvement on hardware (for example, an improvement on a circuit structure such as a diode, a transistor, and a switch) or an improvement on software (an improvement on a method procedure). However, with the development of technologies, improvements of many method procedures at present may be considered as direct improvements on hardware circuit structures. Almost all designers program improved method procedures into hardware circuits to obtain corresponding hardware circuit structures. Therefore, it is improper to assume that the improvement of a method procedure cannot be implemented by using a hardware entity module. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic functions are determined by a user programming the device. Designers program by themselves to “integrate” a digital system into a piece of PLD, without the need to ask a chip manufacturer to design and manufacture a dedicated integrated circuit chip. Moreover, at present, the programming is mostly implemented by using “logic compiler” software, instead of manually manufacturing an integrated circuit chip. The logic compiler software is similar to a software compiler used for developing and writing a program, and original codes before compiling also are written by using a specific programming language, which is referred to as a Hardware Description Language (HDL). There are many types of HDLs, such as Advanced Boolean Expression Language (ABEL), Altera Hardware Description Language (AHDL), Confluence, Cornell University Programming Language (CUPL), HDCal, Java Hardware Description Language (JHDL), Lava, Lola, MyHDL, PALASM, and Ruby Hardware Description Language (RHDL), among which Very-High-Speed Integrated Circuit Hardware Description Language (VHDL) and Verilog are most commonly used now. Those skilled in the art also should know that a hardware circuit for implementing the logic method procedure may be easily obtained only by slightly logically programming the method procedure using the above several hardware description languages and programming it into an integrated circuit. Accordingly, it is appreciated that the embodiments described herein can involve hardware, software, or a combination thereof.

A controller may be implemented in any suitable manner. For example, the controller may be in the form of a microprocessor or a processor and a computer readable medium storing a computer readable program code (for example, software or firmware) executable by the (micro)processor, a logic gate, a switch, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, and an embedded micro-controller. Examples of the controller include, but are not limited to, the following micro-controllers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as a part of control logic of a memory. It is appreciated that the controller can be implemented by using pure computer readable program codes, and in addition, the method steps may be logically programmed to enable the controller to implement the same function in the form of a logic gate, a switch, an ASIC, a programmable logic controller, and an embedded microcontroller. Therefore, this type of controller may be considered as a hardware component, and apparatuses included in the controller for implementing various functions may also be considered as structures inside the hardware component. Or, the apparatuses used for implementing various functions may even be considered as both software modules for implementing the method and structures inside the hardware component.

The system, apparatus, module or unit illustrated in the above embodiments may be implemented by using a computer chip or an entity, or a product having a certain function.

For ease of description, when the apparatus is described, it is divided into various units based on functions for respective descriptions. When the present application is implemented, the functions of the units may be implemented in the same or multiple software and/or hardware.

It is appreciated that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may be implemented as a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may be a computer program product implemented on one or more computer usable storage media (including, but not limited to, a magnetic disk memory, a CD-ROM, an optical memory, and the like) including computer usable program codes.

The present disclosure is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that a computer program instruction may be used to implement each process and/or block in the flowcharts and/or block diagrams and combinations of processes and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a general-purpose computer, a special-purpose computer, an embedded processor or a processor of any other programmable data processing device to generate a machine, such that the instructions executed by a computer or a processor of any other programmable data processing device generate an apparatus for implementing a function specified in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions may also be stored in a computer readable storage that can instruct the computer or any other programmable data processing device to work in a particular manner, such that the instructions stored in the computer readable storage generate an artifact that includes an instruction apparatus. The instruction apparatus implements a function specified in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, such that a series of operation steps are performed on the computer or another programmable device, thereby generating computer-implemented processing. Therefore, the instructions executed on the computer or another programmable terminal device provide steps for implementing a function specified in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.

In a typical configuration, a computation device includes one or more central processing units (CPUs), an input/output interface, a network interface, and a memory.

The memory may include computer readable media such as a volatile memory, a Random Access Memory (RAM), and/or non-volatile memory, e.g., Read-Only Memory (ROM) or flash RAM. The memory is an example of a computer readable medium.

The computer readable medium includes non-volatile and volatile media as well as movable and non-movable media and can implement information storage by means of any method or technology. Information may be a computer readable instruction, a data structure, and a module of a program or other data. An example of a storage medium of a computer includes, but is not limited to, a phase change memory (PRAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), other types of RAMs, a ROM, an electrically erasable programmable read-only memory (EEPROM), a flash memory or other memory technologies, a compact disk read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical storages, a cassette tape, a magnetic tape/magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, and can be used to store information accessible by the computing device. According to the definition in this text, the computer readable medium does not include transitory media, such as a modulated data signal and a carrier.

It should be further noted that the terms “include”, “comprise” or their other variations are intended to cover non-exclusive inclusion, so that a process, method, commodity or device including a series of elements not only includes the elements, but also includes other elements not clearly listed, or further includes inherent elements of the process, method, commodity or device. In a case without any more limitations, an element defined by “including a/an . . . ” does not exclude that the process, method, commodity or device including the element further has other identical elements.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may be implemented as a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present application may be in the form of a computer program product implemented on one or more computer usable storage media (including, but not limited to, a magnetic disk memory, a CD-ROM, an optical memory and the like) including computer usable program codes.

The present application may be described in a common context of a computer executable instruction executed by a computer, for example, a program module. Generally, the program module includes a routine, a program, an object, an assembly, a data structure, and the like used for executing a specific task or implementing a specific abstract data type. The present application may also be implemented in distributed computing environments, and in the distributed computer environments, a task is executed by using remote processing devices connected through a communications network. In the distributed computer environments, the program module may be located in local and remote computer storage media including a storage device.

The embodiments in the specification are described progressively; identical or similar parts of the embodiments may be obtained with reference to each other, and each embodiment emphasizes a part different from other embodiments. The system embodiment can be similar to the method embodiment, so it is described simply. For related parts, reference may be made to the descriptions of the parts in the method embodiment.

The above descriptions are merely embodiments of the present application, and are not intended to limit the present application. It is appreciated that the present application may have various modifications and variations. Any modification, equivalent replacement, improvement or the like made without departing from the spirit and principle of the present application should all fall within the scope of claims of the present application.

1-27. (canceled)
28. A method for identifying network attacks, comprising: acquiring a plurality of access data sets for each of at least two time periods of a target website server, each of the plurality of access data sets including one or more fields; determining, for each of the at least two time periods, a quantity of access data sets having a same field of the one or more fields; and determining that at least two access requests of the plurality of access data sets are network attacks based on at least one of: quantities of the access data sets having the same field of the one or more fields within each of the time periods being the same, or a difference between a maximum value and a minimum value of quantities of access data sets having the same field in the at least two time periods, or a difference between the quantity of the access data sets in each of the at least two time periods and an average value of the quantities of the access data sets having the same field in the at least two time periods.
29. The method of claim 28, wherein acquiring the plurality of access data sets for each of the at least two time periods of the target website server further comprises: collecting an access log of the target website server; and acquiring the access data sets within the at least two time periods from the access log.
30. The method of claim 29, wherein collecting the access log of the target website server further comprises: collecting an access log of a front-end application of the target website server.
31. The method of claim 29, wherein after acquiring the plurality of access data sets for each of the at least two time periods from the access logs, the method further comprises: storing the plurality of access data sets within the at least two time periods in a database, and wherein determining, for each of the at least two time periods, the quantity of access data sets having the same field of the one or more fields further comprises: querying the database for the access data sets within the at least two time periods, and counting the quantity of the access data sets having the same field for each of the at least two time periods.
32. The method of claim 28, wherein determining that the at least two access requests of the plurality of access data sets are the network attacks further comprises: determining whether the quantities of the access data sets having the same field for the at least two time periods are the same; and in response to the quantities of the access data sets having the same field being the same, determining that the at least two access requests of the plurality of access data sets are the network attacks.
33. The method of claim 28, wherein determining that the at least two access requests of the plurality of access data sets are the network attacks further comprises: acquiring the maximum value and the minimum value of the quantities of the access data sets having the same field; determining the difference between the maximum value and the minimum value; and determining the at least two access requests of the plurality of access data sets as the network attacks in response to a determination that the difference is less than a preset threshold.
34. The method of claim 28, wherein determining that the at least two access requests of the plurality of access data sets are the network attacks further comprises: determining the average value of the quantities of the access data sets having the same field; determining the difference between the quantity of the access data sets in each of the at least two time periods and the average value; and determining that the at least two access requests of the plurality of access data sets are the network attacks in response to a determination that the difference is less than a preset threshold.
35. The method of claim 28, wherein the at least two time periods comprise adjacent time periods.
36. The method of claim 28, wherein the one or more fields comprise at least one of an Internet Protocol (IP) address, a domain name that accesses the target website, a browser that accesses the target website, or an Uniform Resource Locator (URL).
37. An apparatus for identifying network attacks, comprising: a memory device storing instructions; and a processor arranged to execute the instructions to cause the apparatus to: acquire a plurality of access data sets for at least two time periods of a target website server, each of the plurality of access data sets including one or more fields; determine, for each of the at least two time periods, a quantity of access data sets having a same field of the one or more fields; and determine that at least two access requests of the plurality of access data sets are network attacks based on at least one of: quantities of the access data sets having the same field of the one or more fields within each of the time periods being the same, or a difference between a maximum value and a minimum value of quantities of access data sets having the same field in the at least two time periods, or a difference between the quantity of the access data sets in each of the at least two time periods and an average value of the quantities of the access data sets having the same field in the at least two time periods.
38. The apparatus of claim 37, wherein the processor is arranged to execute the instructions to cause the apparatus to: collect an access log of the target website server; and acquire the access data sets within the at least two time periods from the access log.
39. The apparatus of claim 38, wherein the processor is arranged to execute the instructions to cause the apparatus to collect an access log of a front-end application of the target website server.
40. The apparatus of claim 38, wherein the processor is arranged to execute the instructions to cause the apparatus to: store the plurality of access data sets within the at least two time periods in a database, and query the database for the access data sets within the at least two time periods; and count the quantity of the access data sets having the same field for each of the at least two time periods.
41. The apparatus of claim 37, wherein the processor is arranged to execute the instructions to cause the apparatus to: determine whether the quantities of the access data sets having the same field for the at least two time periods are the same, and in response to the quantities of the access data sets having the same field being the same, determine that the at least two access requests of the plurality of access data sets are the network attacks.
42. The apparatus of claim 37, wherein the processor is arranged to execute the instructions to cause the apparatus to: acquire the maximum value and the minimum value of the quantities of the access data sets having the same field; determine the difference between the maximum value and the minimum value; and determine the at least two access requests of the plurality of access data sets as the network attacks in response to a determination that the difference is less than a preset threshold.
43. The apparatus of claim 37, wherein the processor is arranged to execute the instructions to cause the apparatus to: determine the average value of the quantities of the access data sets having the same field; determine the difference between the quantity of the access data sets in each of the at least two periods and the average value; determine whether the difference is less than a threshold; and determine that the at least two access requests of the plurality of access data sets are the network attacks in response to a determination that the difference is less than the threshold.
44. The apparatus of claim 37, wherein the at least two time periods comprise adjacent time periods.
45. The apparatus of claim 37, wherein the one or more fields comprise at least one of an Internet Protocol (IP) address, a domain name that accesses the target website, a browser that accesses the target website, or an Uniform Resource Locator (URL).
46. A non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of an electronic device to cause the device to perform a method for identifying network attacks, the method comprising: acquiring a plurality of access data sets for each of at least two time periods of a target website server, each of the plurality of access data sets including one or more fields; determining, for each of the at least two time periods, a quantity of access data sets having a same field of the one or more fields; and determining that at least two access requests of the plurality of access data sets are network attacks based on at least one of: quantities of the access data sets having the same field of the one or more fields within each of the time periods being the same, or a difference between a maximum value and a minimum value of quantities of access data sets having the same field in the at least two time periods, or a difference between the quantity of the access data sets in each of the at least two time periods and an average value of the quantities of the access data sets having the same field in the at least two time periods.
47. The non-transitory computer readable medium of claim 46, wherein acquiring the plurality of access data sets for each of the at least two time periods of the target website server further comprises: collecting an access log of the target website server; and acquiring the access data sets within the at least two time periods from the access log.
48. The non-transitory computer readable medium of claim 47, wherein collecting the access log of the target website server further comprises: collecting an access log of a front-end application of the target website server.
49. The non-transitory computer readable medium of claim 47, wherein after acquiring the plurality of access data sets for each of the at least two time periods from the access logs, the set of instructions is further executable by the at least one processor of the electronic device to perform: storing the plurality of access data sets within the at least two time periods in a database, and wherein determining, for each of the at least two time periods, the quantity of access data sets having the same field of the one or more fields further comprises: querying the database for the access data sets within the at least two time periods, and counting the quantity of the access data sets having the same field for each of the at least two time periods.
50. The non-transitory computer readable medium of claim 46, wherein determining the at least two access requests of the plurality of access data sets are the network attacks further comprises: determining whether the quantities of the access data sets having the same field for the at least two time periods are the same; and in response to the quantities of the access data sets having the same field being the same, determining that the at least two access requests of the plurality of access data sets are the network attacks.
51. The non-transitory computer readable medium of claim 46, wherein determining that the at least two access requests of the plurality of access data sets are the network attacks further comprises: acquiring the maximum value and the minimum value of the quantities of the access data sets having the same field; determining the difference between the maximum value and the minimum value; and determining the at least two access requests of the plurality of access data sets as the network attacks in response to a determination that the difference is less than a preset threshold.
52. The non-transitory computer readable medium of claim 46, wherein determining that the at least two access requests of the plurality of access data sets are the network attacks further comprises: determining the average value of the quantities of the access data sets having the same field; determining the difference between the quantity of the access data sets in each of the at least two time periods and the average value; and determining that the at least two access requests of the plurality of access data sets are the network attacks in response to a determination that the difference is less than a preset threshold.
53. The non-transitory computer readable medium of claim 46, wherein the at least two time periods comprise adjacent time periods.
54. The non-transitory computer readable medium of claim 46, wherein the fields comprise at least one of an Internet Protocol (IP) address, a domain name that accesses the target website, a browser that accesses the target website, or an Uniform Resource Locator (URL).